2. Abstract/Project Summary


Domoic acid toxicosis (DAT) is a major cause of sea lion stranding and die-offs. This
project investigated sea lions with and without DAT to determine whether a biomarker of
DAT could be identified in serum or plasma, in an effort to enable the Navy Marine Mammal
Program to screen animals for DAT. Several proteomic approaches were utilized, and data
were modeled using neural networks to assess whether proteins could classify individual sea
lions as DAT or non-DAT. Commercial cytokine arrays and MALDI-TOF peptide profiling
can be utilized as screening tools offering >90% accuracy in the diagnosis of acute DAT, but
will only detect about 25% of those sea lions with DAT. Alternatively, these tools can also
exclude a diagnosis of DAT with >98% accuracy, but will only detect 25% (cytokine array) or
60% (MALDI-TOF) of non-DAT sea lions. 2D gel electrophoresis studies of chronic DAT sea
lions demonstrated that Apolipoprotein E and eosinophil counts combined in a neural network
model resulted in a perfect marker of DAT in training set samples. External validation
supported this combination as a biomarker with a test sensitivity of 86% and specificity of
85%. This combination performed the best of all markers in this study.


3.  Scientific  Technical  Objectives 


1. Methodological development for high abundance protein depletion from sea lion plasma.

2.  Assess  the  performance  of  commercial  cytokine  arrays  to  predict  domoic  acid  toxicosis  in 
sea  lions.  Assess  the  value  of  Artificial  Neural  Network  analysis  to  enhance  performance 
of  diagnostic  tests.  Validate  biomarkers  using  investigator-blinded  plasma  samples  from 
The  Marine  Mammal  Center. 

3.  Assess  the  performance  of  MALDI-ToF  mass  spectrometry  peptide  profiling  of  serum  to 
predict  acute  domoic  acid  toxicosis  in  sea  lions.  Assess  the  value  of  Artificial  Neural 
Network  analysis  to  enhance  performance  of  diagnostic  tests.  Validate  biomarkers  using 
investigator-blinded  plasma  samples  from  The  Marine  Mammal  Center. 

4.  Assess  the  performance  of  2D  gel  electrophoresis  of  plasma  proteins  to  predict  chronic 
domoic  acid  toxicosis  in  sea  lions.  Assess  the  value  of  Artificial  Neural  Network 
analysis  to  enhance  performance  of  diagnostic  tests.  Validate  biomarkers  using 
investigator-blinded  plasma  samples  from  The  Marine  Mammal  Center. 

5.  Ancillary  Studies:  compare  cerebral  spinal  fluid  proteins  by  tandem  mass  spectrometry 
between  acute  DAT,  chronic  DAT,  and  non-DAT  sea  lions.  Compare  serum  from  DAT 
and  non-DAT  sea  lions  using  iTRAQ. 


4.  Approach 


1.  Methodological  development  for  high  abundance  protein  depletion  from  sea  lion 
plasma. 

California sea lion plasma samples were collected by Dr. Frances Gulland and qualified
technicians at The Marine Mammal Center, Sausalito, CA. High abundance proteins were
depleted from sea lion plasma utilizing several commercially available depletion columns and
Proteominer™ ligand library bead columns. Depletion ability was compared using 2D gel
electrophoresis to determine the procedure that resulted in the greatest number of distinct
protein spots with the least amount of albumin.

2. Assess the performance of commercial cytokine arrays to predict domoic acid
toxicosis in sea lions.

Luminex  cytokine  bead  arrays  were  utilized  to  estimate  the  relative  quantity  of  plasma  or 
serum  cytokines  in  sea  lions  with  DAT  or  without  DAT.  All  data  were  collated  and  statistical 
comparisons  were  made.  Additionally  Artificial  Neural  Networks  were  trained  to  discover 
hidden  relationships  between  cytokine  levels  across  both  sea  lion  groups.  Groups  of  analytes 
that  have  the  greatest  predictive  value  based  on  ROC  analysis  were  validated  in  an 
investigator-blinded  study. 

3. Assess the performance of MALDI-ToF mass spectrometry peptide profiling of
serum to predict acute domoic acid toxicosis in sea lions.

Serum peptides from three groups of sea lions (N=107), 1) acute DAT, 2) diseased but
not DAT, and 3) Navy Marine Mammal Program sea lions, were purified using C18 ZipTip columns
and analyzed by MALDI-TOF mass spectrometry. Mass features were aligned, smoothed, and
normalized across samples by total ion intensity or to a Glu-1-fibrinopeptide internal control
peptide (Progenesis MALDI). Individual masses were assessed as biomarkers using area under
receiver operator characteristic (AuROC) curves. Combinations of features were assessed
as markers using combinations of artificial neural networks. The best performing features or
models were validated using an investigator-blinded test set of serum samples from The
Marine Mammal Center (N=20).


4.  Assess  the  performance  of  2D  gel  electrophoresis  of  plasma  proteins  to  predict  chronic 
domoic  acid  toxicosis  in  sea  lions. 

Plasma samples from 20 sea lions were depleted and analyzed by large format 2D gel
electrophoresis. Gels were stained with SyproRuby stain and aligned using SameSpots
(Progenesis). Differentially abundant protein spots were excised for identification. Spot
volumes were further analyzed for individual AuROC curve performance. Proteins with an AuROC
in excess of 0.7 were used to train artificial neural networks. Top networks were validated
using an external test set (N=10). Attempts to identify all spots utilized in model building
were made by tandem mass spectrometry. Attempts to validate 2D gel results were made using
western blotting.

5.  Ancillary  Studies:  compare  cerebral  spinal  fluid  proteins  by  tandem  mass 
spectrometry  between  acute  DAT,  chronic  DAT,  and  non-DAT  sea  lions.  Compare 
serum  from  DAT  and  non-DAT  sea  lions  using  iTRAQ. 

A pilot study to investigate cerebral spinal fluid (CSF) as a potential source of markers for
DAT was undertaken to utilize new technology in the laboratory. CSF proteins were isolated from
acute DAT, chronic DAT, and non-DAT sea lions. Proteins were identified by tandem mass
spectrometry. Protein differences were compared using label-free spectral counting. Assays
for quantotypic peptides for 6 CSF proteins were developed. Synthetic peptide standards were
customized and performance measures of the assays were calculated. Limit of detection, limit
of quantification, precision, and standard response curves were calculated for these peptides
for the measurement of sea lion-specific proteins in cerebral spinal fluid.

A second pilot study was undertaken using iTRAQ labeling of serum proteins from sea lions
with DAT and without DAT to complement the 2D gel project. Serum samples were digested and
analyzed by LC/MS/MS. Differences in proteins were estimated using balanced reporter ions.


5.  Accomplishments 


1.  Methodological  development  for  high  abundance  protein  depletion  from  sea  lion  plasma. 

In plasma biomarker studies of mammals, albumin, immunoglobulins, and 18 additional
proteins comprise 99% of the total plasma protein. This leaves the remaining 1% of plasma proteins
relatively invisible to detection by proteomic techniques. In order to investigate a wide range of
proteins as biomarkers for California sea lions, many of the abundant proteins must be depleted.
There are several commercially available columns which have been thoroughly investigated in
mice, rats, chimpanzees, and humans; however, these columns remain untested for marine
mammals. To resolve the issue of high abundance plasma protein depletion in sea lions, we set
out to test commercially available depletion columns and determine their suitability for sea lion
plasma protein depletion. Immunodepletion columns are most commonly utilized in plasma and
serum proteomic studies, but these columns are usually highly specific for the species against
whose proteins the antibodies were raised. Therefore, we also chose to investigate a
non-species-specific column, the AURUM albumin/Ig depletion column (Bio-Rad). In December 2007,
Bio-Rad released a new product, the Proteominer ligand library bead depletion column. These
columns were not initially selected because they were novel and untested; however, we decided
to include them in the sea lion study in June 2008 following a series of tests of the
Proteominer columns in the Nephrology Proteomics Lab at MUSC (the PI is associate director of
this lab).

Depletion columns were tested on frozen plasma samples collected at The Marine Mammal
Center and sent to MUSC on dry ice. Blood samples were collected in citrate tubes to prevent
coagulation prior to centrifugation. No protease inhibitors were utilized in this test set. Initially,
three depletion columns were tested for their ability to deplete albumin and immunoglobulins from
sea lion plasma samples. The Aurum serum protein mini-kit (Bio-Rad), Proteoprep20 plasma
immunodepletion kit (Sigma Chemical), and Proteomelab IgY plasma depletion kit (Beckman)
were chosen as the test depletion columns after consulting with companies that have tested these
columns on canine and farm animal serum samples. Our rationale for choosing these products was
that sea lions are more closely related to canines.

Plasma samples were prepared by filtration through 0.1 µm filters to remove any cell debris and
bacteria. Following filtration, 200-500 µg protein was depleted according to each
manufacturer's protocol. Immunodepletion columns should bind albumin and any high abundance
protein for which conjugated antibodies are present in the column. Aurum columns are not
immunodepletion columns, but contain a resin which has been shown to bind albumin and
immunoglobulins, irrespective of species. Depleted samples were precipitated in 5 volumes of
acetone and washed with 75% ethanol. Protein concentration was measured by colorimetric assay
(Bradford method, Bio-Rad) and 50 µg protein was added to a compatible buffer for 2DE (7 M
urea, 2 M thiourea, 2% CHAPS, 0.2% Biolytes, 1% DTT). Proteins were loaded onto an 11 cm
IPG strip (pH 3-10) and focused in a Protean IEF cell (Bio-Rad, Hercules, CA) for 100,000
Volt-hours with a maximum voltage of 8000 Volts and a maximum current of 50 µA/strip. After
focusing, proteins were separated by SDS polyacrylamide gel electrophoresis on an 8-16%
gradient gel using a Criterion Dodeca cell (Bio-Rad). Gels were washed with deionized water,
fixed with 10% methanol/7% acetic acid, stained overnight in the dark with Sypro Ruby
(Invitrogen Molecular Probes, Carlsbad, CA), destained with 10% methanol/7% acetic acid, and
imaged on an FX Pro Plus fluorescent imager (Bio-Rad). Images were analyzed using PDQuest
software version 7.1. Spots were automatically detected and matched, followed by manual editing
of spots and spot alignment by an experienced user to improve detection and eliminate artifacts.
Spot intensity was normalized to global intensity. Figure 1 depicts 2D gels from undepleted,
Aurum column depleted, ProteomeLab IgY column depleted, and Proteoprep 20 column
depleted samples. High abundance spots that corresponded to the size and pI of albumin were
picked by robot and digested with trypsin for mass spectrometric identification. Albumin was
identified by MALDI-TOF-TOF mass spectrometry using the MASCOT algorithm, which
matched MS/MS data to canine albumin (score not shown, but was significant). Albumin was
then quantified and the normalized intensity compared across gels. Of all columns, the
ProteomeLab IgY performed the best, resulting in 94% fractional depletion of albumin. The other
two columns did not deplete albumin as well as the ProteomeLab IgY columns, but did appear to
result in fewer immunoglobulin proteins (large horizontal streaks).

Figure 1. Two-dimensional gels showing plasma proteins from A) undepleted sea lion
plasma, B) Aurum column depleted plasma, C) ProteomeLab IgY column depleted plasma, and
D) Proteoprep column depleted plasma. Numbers indicate the fractional abundance of albumin
in the depleted sample relative to the undepleted sample (spots normalized to total spot
intensity within the gel). Albumin depletion was greatest with the ProteomeLab IgY column.
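The albumin comparison above can be illustrated with a short calculation. The following is a minimal sketch, assuming that fractional depletion is one minus the ratio of the gel-normalized albumin intensity in the depleted sample to that in the undepleted sample; the function name and the example intensities are hypothetical and are not values from this study.

```python
def fractional_depletion(albumin_depleted, total_depleted,
                         albumin_undepleted, total_undepleted):
    """Fraction of albumin removed, using spot intensities normalized per gel.

    Each albumin spot intensity is first divided by the total spot intensity
    of its own gel, then the depleted/undepleted ratio is subtracted from 1.
    """
    if total_depleted <= 0 or total_undepleted <= 0:
        raise ValueError("total gel intensity must be positive")
    normalized_depleted = albumin_depleted / total_depleted
    normalized_undepleted = albumin_undepleted / total_undepleted
    if normalized_undepleted == 0:
        raise ValueError("no albumin signal in the undepleted gel")
    return 1.0 - normalized_depleted / normalized_undepleted


# Hypothetical intensities (arbitrary units), chosen only for illustration:
# albumin is 50% of the undepleted gel and 3% of the depleted gel.
example = fractional_depletion(3.0, 100.0, 50.0, 100.0)  # about 0.94
```

Normalizing within each gel before taking the ratio keeps the comparison independent of differences in total protein load or staining between gels.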


Although IgY columns appear to be the best choice for high abundance protein depletion for sea
lion plasma, there are two major issues with this technique: 1) immunodepletion columns must
be reused and are prone to incomplete recharging, which results in cross-contamination between
samples as the column is reused over and over; 2) antibodies are proteins and, over time, will
degrade, leading to incomplete depletion and inconsistent results. In an effort to avoid these
problems, we decided to test albumin and IgG depletion using Proteominer ligand library bead
columns. Ligand library beads are hexameric peptides conjugated to agarose or sepharose
beads. The idea behind ligand library beads is that many plasma proteins bind peptides and
protein domains. Instead of creating a group of peptides to which every plasma protein will
bind, a random hexamer library was created so that all proteins will have a chance to bind to a
specific peptide. Because there is a finite number of ligands, proteins that are in excess will
saturate their ligands, and the unbound excess will flow through the column. Proteins that are
in lesser abundance will bind completely to the column. After elution, high and low abundance
proteins will be more equally represented. One of the best aspects of these columns is that they
are single use and therefore not subject to cross-contamination or to degradation after many
uses.
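The saturation principle described above can be made concrete with a toy model. This is only an illustrative sketch, assuming every protein has the same fixed binding capacity on the beads; the function names, protein names, and amounts are hypothetical, and real binding behavior is more complex.

```python
def bound_amounts(amounts, capacity):
    """Amount of each protein retained when binding is capped at `capacity`.

    Abundant proteins saturate their ligands (excess flows through), while
    proteins below the capacity are retained completely.
    """
    return {name: min(amount, capacity) for name, amount in amounts.items()}


def dynamic_range(amounts):
    """Ratio of the most abundant to the least abundant detectable protein."""
    positive = [a for a in amounts.values() if a > 0]
    return max(positive) / min(positive)


# Hypothetical plasma composition in arbitrary units.
plasma = {"albumin": 50000.0, "low_abundance_protein": 5.0}
eluate = bound_amounts(plasma, capacity=10.0)

range_before = dynamic_range(plasma)  # 10000-fold difference
range_after = dynamic_range(eluate)   # 2-fold difference
```

In this simplified picture the eluate retains all of the low abundance protein but only a capped amount of albumin, which is why the two become more equally represented.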

1 mL of plasma (50 mg/mL) was loaded onto a Proteominer column and bound proteins were eluted
according to the manufacturer's protocol. Because the Proteominer elution buffer is not optimal
for 2D gel electrophoresis, we were only able to load 15 µg protein onto the 2D gel. Figure 2
shows a Proteominer depleted plasma sample. Albumin was difficult to observe, and the area
where albumin should reside is encircled. Proteomic analysis of the top 15 abundant plasma
proteins from sea lions using LC/MS/MS before and after Proteominer depletion demonstrated that
the Proteominer depletion strategy provided ample reduction in albumin and immunoglobulins.
These ligand library bead columns consistently provided the best depletion of albumin and
several high abundance proteins compared to the three other commercially available depletion
columns tested.


2.  Assess  the  performance  of  commercial  cytokine  arrays  to  predict  domoic  acid  toxicosis  in 
sea  lions. 

This aim underwent several iterations as the project transitioned from the graduate student to the
postdoctoral associate. Preliminary findings from 2009/2010 demonstrated that cytokine
multiplex analyses of sea lion serum using human Bio-Plex 27-cytokine panels (Bio-Rad) could
offer an accurate test to discriminate between sea lions with DAT and those without DAT.
Artificial neural networks were utilized to create models which, together with 27 cytokines, would
be capable of diagnosing sea lions as DAT or non-DAT. Although accurate for classifying sea
lions without DAT (negative predictive value = 100%), the test was not specific (18%). We
interpreted this to mean that a negative test result gives high confidence that the sea lion is
negative for DAT, but fewer than 20% of non-DAT sea lions would be detected in this manner.

In 2010, we intended to improve the performance of this test by including more sea lions
in the training set of the artificial neural network, as well as by creating groups of sea lions
which were frequency matched so that the sea lion stranding population would be better
represented as a whole in terms of gender and age. Prior to this task it was necessary to create
an itemized and collated sample database. In this study, sera from 110 sea lions [35 acute DAT,
75 non-DAT] were screened using the human 27-cytokine panel (Bio-Rad) with methods identical
to those established by the graduate student in 2009. Samples obtained from The Marine Mammal
Center were frequency matched, and 20 serum samples from the Navy Marine Mammal Program were
included to ensure representation from sea lions that are known, with a high degree of
certainty, not to have a history of domoic acid toxicosis. Sample clinical data are essentially
identical to those described in the MALDI profiling experimental methods section below.

Individual cytokines were measured with the assumption that cross-reactivity would be
consistent across sea lion samples, even if absolute values were not accurate, given that the
antibodies were raised against human proteins. No cytokine was an individual predictor of
domoic acid toxicosis, as all areas under receiver operator characteristic (AuROC) curves were
less than 0.62.

Artificial neural networks were created to find multidimensional relationships within
the data that could be utilized to discriminate between DAT and non-DAT. For this effort, we
automated model generation such that 101 neural networks could be rapidly constructed and
tested, thereby allowing us to compare model performance across a larger number of models. In
addition, we noticed that some cytokines were not consistently detectable across samples. In
response, we also created models containing only cytokines that were present in 80% and 90% of
sea lion samples. Further, we trained models using both raw and quantiled data. Data quantiling
was utilized to enhance signal to noise. In all, 606 models were created from the raw and
permutated data sets combined. AuROC curves


                             Exclusion of DAT                Prediction of DAT
                             Quant          Quant            Quant
                             9 cytokines    19 cytokines     9 cytokines
Model #                      33             49               33
AUC                          0.94           0.96             0.94
Threshold                    1E-15          7E-06            0.9999

Training Performance
  Sens                       1.00           0.97             0.94
  Spec                       0.09           0.68             0.96
  PPV                        0.34           0.59             0.92
  NPV                        1.00           0.98             0.97

Qualification Performance
  Sens                       1.00           1.00             0.30
  Spec                       0.17           0.25             1.00
  PPV                        0.50           0.53             1.00
  NPV                        1.00           1.00             0.63

Table 1. Statistical performance measures calculated for the
training set and test set from the two best performing
models. Model 49 was the best performing model for the
exclusion of DAT (100% NPV) based upon the test set. Model
33 was the best performing for the prediction of DAT (100%
PPV) based upon the test set.


were  utilized  to  estimate  training  performance  and  four  thresholds  were  chosen  to  calculate 
statistical  performance  measures  based  on  optimal  threshold,  minimal  misclassification,  highest 
negative  predictive  value  and  highest  positive  predictive  value. 
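The four threshold choices named above can be sketched in code. This is a minimal illustration, not the procedure used in the study: the report does not define "optimal threshold", so Youden's J (sensitivity + specificity - 1) is assumed here, and all function names and example scores are hypothetical.

```python
import math


def confusion(scores, labels, threshold):
    """Count (TP, FP, TN, FN) when score >= threshold is called DAT (label 1)."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        called_positive = score >= threshold
        if called_positive and label == 1:
            tp += 1
        elif called_positive:
            fp += 1
        elif label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def ratio(numerator, denominator):
    """Return NaN instead of raising when a measure is undefined."""
    return numerator / denominator if denominator else math.nan


def measures(tp, fp, tn, fn):
    return {
        "sens": ratio(tp, tp + fn),
        "spec": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def choose_thresholds(scores, labels):
    """Pick one threshold per criterion from the observed model outputs."""
    counts = {t: confusion(scores, labels, t) for t in sorted(set(scores))}
    table = {t: measures(*c) for t, c in counts.items()}

    def known(value):
        # Undefined measures rank below every defined value.
        return -1.0 if math.isnan(value) else value

    return {
        # Assumed definition of "optimal": maximize sens + spec (Youden's J).
        "optimal": max(table, key=lambda t: known(table[t]["sens"])
                       + known(table[t]["spec"])),
        "min_misclassification": min(counts,
                                     key=lambda t: counts[t][1] + counts[t][3]),
        # Ties in NPV (or PPV) are broken by specificity (or sensitivity).
        "highest_npv": max(table, key=lambda t: (known(table[t]["npv"]),
                                                 known(table[t]["spec"]))),
        "highest_ppv": max(table, key=lambda t: (known(table[t]["ppv"]),
                                                 known(table[t]["sens"]))),
    }
```

For example, with hypothetical outputs `[0.05, 0.2, 0.4, 0.6, 0.8, 0.95]` and labels `[0, 0, 1, 0, 1, 1]`, the highest-NPV criterion selects a low threshold and the highest-PPV criterion a high one, mirroring the trade-off between excluding and predicting DAT.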

To estimate the performance of each model, we utilized serum samples from 20 sea lions that
were not used to create the model (test set). Samples were blinded to the investigator, and
data were collected in the same manner as for the training set. For models trained using
quantiled data, the test data set was converted to quantiles prior to prediction. Sensitivity,
specificity, negative predictive value, and positive predictive value were calculated for all
test set data at each threshold calculated from the training set ROC curves. Table 1 highlights
the two ‘best’ models for the exclusion of DAT or prediction of DAT. For the exclusion of DAT,
model 49 utilized 19 cytokines, which were quantiled. The test set performance estimates a
specificity of 25% with a 100% negative predictive value. The negative predictive value for the
training set was also high (98%), suggesting that 19 cytokines together with neural network
modeling can identify sea lions without DAT with high accuracy (>98%), but with very low
specificity (25%). For the prediction of DAT, model 33 was the ‘best’ and utilized 9 quantiled
cytokines. Statistical performance measures calculated for the test set and training set suggest
that this test is also highly accurate and can predict DAT with >92% confidence, but will only
detect 30% of those sea lions with DAT.
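The conversion of test data for quantile-trained models, mentioned above, can be sketched as follows. The report does not give the exact quantiling method, so this example assumes an empirical-CDF mapping in which each test value is expressed as the fraction of training values that are less than or equal to it; the function names and analyte names are hypothetical.

```python
from bisect import bisect_right


def to_quantile(value, training_values):
    """Fraction of training values <= value (empirical CDF of the training set)."""
    ordered = sorted(training_values)
    return bisect_right(ordered, value) / len(ordered)


def quantile_transform(samples, training_columns):
    """Map every analyte value in each sample onto the training-set quantile scale."""
    return [
        {analyte: to_quantile(value, training_columns[analyte])
         for analyte, value in sample.items()}
        for sample in samples
    ]


# Hypothetical training distribution for one analyte and one test sample.
training = {"cytokine_a": [1.0, 2.0, 3.0, 4.0]}
converted = quantile_transform([{"cytokine_a": 2.5}], training)
```

Using only the training distribution as the reference keeps the test set from influencing the scale on which the model makes its predictions.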

Overall, we believe that cytokine analysis and artificial neural networks can be utilized as
a rapid discriminatory screening tool. Depending upon the model being utilized and the
screening need, cytokine profiling can offer a level of confidence to a diagnosis or exclude a
diagnosis of DAT, but low sensitivity or specificity limits the utility of this tool in
population-wide studies.

3. Assess the performance of MALDI-ToF mass spectrometry peptide profiling of serum to
predict  acute  domoic  acid  toxicosis  in  sea  lions.  Assess  the  value  of  Artificial  Neural 
Network  analysis  to  enhance  performance  of  diagnostic  tests.  Validate  biomarkers  using 
investigator-blinded  plasma  samples  from  The  Marine  Mammal  Center. 

As described in specific aim 1 of the original proposal, MALDI-TOF peptide
profiling was completed for 107 sea lions to estimate whether features in the serum could
accurately classify sea lions with DAT. The selection of serum samples was greatly enhanced
by the creation of the serum sample database constructed in August/September 2010. This aim is
considered complete by the PIs and co-PIs; however, we are currently working towards
identifying individual peptides to determine whether any insight can be gained into mechanisms
of DAT and whether protein degradation products may reflect potential parent protein
biomarkers in the serum. Although we realize that MALDI-TOF profiling may be technically
difficult and is not considered a point-of-care diagnostic, we feel that this test can be translated
into a centralized laboratory setting for send-out diagnostics. A majority of the effort from 2010
was placed on completing MALDI-TOF profiling.

Experimental  design  and  methods. 

Inclusion Criteria. Serum samples were acquired from The Marine Mammal Center (TMMC;
Sausalito, CA) and the U.S. Navy Marine Mammal Program (USNMMP; San Diego, CA).
Samples from the USNMMP had no selection criteria applied and were placed in an independent
group (NAVY; n=20). These samples were collected between 2000 and 2008, and at the time of
sampling 7 of 20 exhibited clinical signs, 15 of 20 were fasting, and 7 of 20 were initial samples
taken upon admission to the USNMMP. Inclusion criteria were applied to samples at
TMMC using the available clinical parameters. We retrospectively identified serum
samples collected from 2,343 live California sea lions that stranded along the central California
coast between 2005 and 2010. Of these, only sera drawn within seven days of
admission to TMMC were included, which comprised sera from approximately 2,000 sea lions.
We included sera from both sexes and from adult, subadult, juvenile, and yearling age classes
while attempting to frequency match these criteria between groups in the training set. Diagnoses
were retrospectively confirmed and sera were placed into two groups: individuals suffering from
acute domoic acid toxicosis (acute DAT group) and individuals asymptomatic for DAT (non-DAT
group). Individuals with DAT were identified based on clinical signs such as seizures or
neurological symptoms; the presence of domoic acid in bodily fluids (urine, feces, milk, aqueous
humor) provided additional DAT confirmation in some cases. Acute DAT cases were differentiated
from chronic DAT cases, using brain histology where available (atrophy indicated chronic
DAT). Furthermore, individuals placed in the acute DAT group could not have progressed to
chronic DAT. Individuals with acute DAT as well as an additional confounding etiology, such as
carcinoma or leptospirosis infection, were rejected.

Individuals in the non-DAT group could not have had seizures or other neurological problems
during their time in rehabilitation (regardless of etiology) or have later stranded with signs of
DAT. Available DA results in bodily fluids were negative, and available histology could not
indicate any hippocampal atrophy. This group included those suffering from renal failure
associated with Leptospira interrogans (leptospirosis sub-group), and individuals without signs
of exposure to either domoic acid or leptospirosis (non-DAT/non-leptospirosis group). Blood
chemistry was used to confirm leptospirosis such that individuals with BUN > 100, Na > 150,
creatinine > 2, and P > Ca were classified as having leptospirosis. Two individuals placed in the
leptospirosis sub-group did not meet these criteria (CSL 9332 did not have Na > 150 and CSL 7595
did not have creatinine > 2 or P > Ca); however, leptospirosis was suspected. There were 11
individuals that did not have blood work results available to confirm the absence of
leptospirosis, but given the absence of indications of leptospirosis they were placed into the
acute DAT group and the non-DAT/non-leptospirosis group. Additionally, individuals not in the
leptospirosis sub-group could not have had post-mortem observations characteristic of
leptospirosis infection, such as swollen, pale kidneys and poor renicular differentiation at
gross necropsy or interstitial nephritis on histology.
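The blood chemistry rule above can be written as a simple check. This is a minimal sketch using only the four cut-offs stated in the text; the report does not give measurement units, so none are assumed here, and the function names and example values are hypothetical.

```python
def failed_criteria(bun, sodium, creatinine, phosphorus, calcium):
    """List the blood chemistry criteria for leptospirosis that are NOT met."""
    checks = [
        ("BUN > 100", bun > 100),
        ("Na > 150", sodium > 150),
        ("creatinine > 2", creatinine > 2),
        ("P > Ca", phosphorus > calcium),
    ]
    return [name for name, passed in checks if not passed]


def meets_leptospirosis_criteria(bun, sodium, creatinine, phosphorus, calcium):
    """True only when all four criteria are satisfied."""
    return not failed_criteria(bun, sodium, creatinine, phosphorus, calcium)
```

Returning the list of unmet criteria, rather than only a yes/no answer, makes it easy to flag borderline animals such as those placed in the leptospirosis sub-group on clinical suspicion despite missing one criterion.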

Exclusion Criteria. To limit confounding variables, known pregnant females (i.e., those that
aborted in rehabilitation or with a fetus in utero at necropsy) and individuals with significant
trauma (e.g., missing limbs or life-threatening wounds) were excluded from the study. Sera that
were  drawn  more  than  7  days  after  admission  or  collected  by  heart-stick  or  post-mortem  were 
not  included.  In  addition,  serum  samples  that  were  not  archived  at  -80°C  the  day  of  collection 
were  not  used. 
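The sample-level exclusion rules above can be expressed as a single eligibility check. This is an illustrative sketch only; the record fields and the collection-method labels are hypothetical names, not the structure of the study's sample database.

```python
from dataclasses import dataclass


@dataclass
class SerumRecord:
    days_after_admission: int   # days between admission and blood draw
    collection_method: str      # hypothetical labels, e.g. "venipuncture"
    frozen_same_day: bool       # archived at -80 C on the day of collection
    known_pregnant: bool
    significant_trauma: bool


# Collection routes that the text excludes.
EXCLUDED_METHODS = {"heart-stick", "post-mortem"}


def is_eligible(record):
    """Apply the stated exclusion criteria to one serum sample."""
    return (
        record.days_after_admission <= 7
        and record.collection_method not in EXCLUDED_METHODS
        and record.frozen_same_day
        and not record.known_pregnant
        and not record.significant_trauma
    )
```

A record drawn on day 3 by venipuncture and frozen the same day would pass, while the same record collected post-mortem or on day 8 would be rejected.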

Serum Collection and Storage. Sea lions admitted to TMMC had blood drawn into tiger top
vacutainers and centrifuged; the serum was decanted, aliquoted into freezer vials, and frozen at
-80°C the same day blood was drawn. Blood collected from sea lions in the
USNMMP was allowed to clot for 30 to 60 min before centrifuging. Less than 7 h passed between
clotting and storage at -80°C. Some samples were taken from anaesthetized individuals: 9 of 20
for USNMMP and an undetermined number of samples from TMMC. The internal training
sample set and independent test set were handled independently once received from the
USNMMP or TMMC.

Experimental Design. Sera from the training set (n=107) were extracted and analyzed over two
days. To limit the effect of interday MALDI-TOF variability, two groups were generated. This
was accomplished by first using a randomized list of the 107 sera to separate two groups, which
were then balanced by the two main groups (DAT and non-DAT), the three non-DAT groups
(non-DAT/leptospirosis, non-DAT/non-leptospirosis, and NAVY), sex, age, year, and
outcome (release or euthanasia/death). Day one consisted of 55 sera from 22 males (3 DAT, 5
leptospirosis, 4 other, and 10 NAVY) and 33 females (12 DAT, 2 leptospirosis, and 18 other), of
which 51% were released. Day two consisted of 53 sera from 21 males (3 DAT, 5 leptospirosis,
3 other, and 10 NAVY) and 32 females (15 DAT, 3 leptospirosis, and 14 other), of which 47%
were released. The independent test set (n=20) was blinded (identities and diagnoses) to the
investigator using a number scheme until after analysis, and was run during one day.
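One way to produce two balanced run days is sketched below. The report states only that a randomized list was split and then balanced, so this is an assumed procedure rather than the one actually used: samples are shuffled, grouped into strata, and dealt alternately to the two days. The function name and the sample fields are hypothetical.

```python
import random
from collections import defaultdict


def split_two_days(samples, stratum_of, seed=0):
    """Shuffle samples, then alternate day assignment within each stratum.

    The alternation continues across strata, so both the overall day sizes
    and the per-stratum counts differ by at most one.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)

    by_stratum = defaultdict(list)
    for sample in shuffled:
        by_stratum[stratum_of(sample)].append(sample)

    days = ([], [])
    assigned = 0
    for key in sorted(by_stratum):
        for sample in by_stratum[key]:
            days[assigned % 2].append(sample)
            assigned += 1
    return days
```

A stratum here could be, for example, the tuple of diagnosis group and sex; adding age class, year, and outcome to the tuple extends the balancing to the other variables named above, at the cost of smaller strata.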

Peptide Extraction. Each serum sample was diluted to 0.1% (v/v) trifluoroacetic acid (TFA) using
100 µL of 0.15% (v/v) TFA (Thermo Scientific, Rockford, IL). After 5 min incubation at room
temperature, 10 µL of C8-magnetic beads (ClinProt™ Profiling Kit, Bruker Daltonics, Billerica,
MA) was added, followed by three wash steps of 100 µL 0.1% (v/v) TFA according to
manufacturer's guidelines. Peptides were eluted with 20 µL of 50% acetonitrile in
manufacturer's stabilization buffer and 15 µL was transferred to a clean tube. Finally, 30 µL of
matrix [5 mg mL⁻¹ α-cyano-4-hydroxycinnamic acid (Bruker Daltonics) in HPLC grade
methanol:acetonitrile:water (5:4:1) containing 25 nM Glu-1-fibrinopeptide peptide mass standard
(Glu-Fib; Protea Biosciences, Inc., Morgantown, WV)] was added, mixed, and 2 µL of the
resulting solution was spotted onto a ground steel target plate (MTP 384 ground steel T F plate,
Bruker Daltonics).

Spectra Acquisition. Matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF)
spectra were acquired using a Bruker AutoflexIII. The raw spectra were then imported into
Progenesis MALDI (Nonlinear USA Inc., Durham, NC). Peak intensities were normalized to the
internal standard (Glu-Fib) or Total Ion Current (TIC) and analyzed separately based on
normalization procedure. The independent test set was processed the same as the training set,
and the training set spectra were used to facilitate alignment of the test set.
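
The two normalization schemes can be sketched as follows. This is a minimal illustration and not the Progenesis MALDI implementation: the peak-list layout (a dict of m/z to intensity), the function names, the matching tolerance, and the example m/z used for the Glu-Fib standard are all assumptions, and TIC is approximated here as the sum of the listed peak intensities.

```python
def normalize_to_standard(peaks, standard_mz, tol=0.5):
    """Divide each intensity by the internal-standard peak intensity.

    peaks: dict mapping m/z -> raw intensity (hypothetical layout).
    standard_mz, tol: assumed position and matching window of the standard peak.
    """
    matches = [mz for mz in peaks if abs(mz - standard_mz) <= tol]
    if not matches:
        raise ValueError("internal standard peak not found")
    # Use the peak closest to the expected standard m/z as the reference.
    ref_mz = min(matches, key=lambda mz: abs(mz - standard_mz))
    ref_intensity = peaks[ref_mz]
    return {mz: inten / ref_intensity for mz, inten in peaks.items()}


def normalize_to_tic(peaks):
    """Divide each intensity by the summed intensity of all listed peaks."""
    total = sum(peaks.values())
    if total == 0:
        raise ValueError("spectrum has zero total intensity")
    return {mz: inten / total for mz, inten in peaks.items()}


# Illustrative spectrum; intensities and the standard m/z are made up.
spectrum = {1362.0: 40.0, 1570.7: 200.0, 3017.0: 160.0}
by_standard = normalize_to_standard(spectrum, standard_mz=1570.7)
by_tic = normalize_to_tic(spectrum)
```

Keeping the two outputs separate mirrors the text, where Glu-Fib and TIC normalized data were analyzed independently.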

Receiver  operator  characteristic  (ROC)  Curve  Analysis.  Samples  in  the  training  set  were 
dichotomized  such  that  samples  from  the  DAT  group  had  an  input  of  1  and  samples  from  the 
non-DAT  had  an  input  of  0,  meaning  that  a  positive  indicates  DAT. 
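
With samples coded 1 (DAT) and 0 (non-DAT), the area under the ROC curve for a single peak can be computed as the probability that a randomly chosen DAT sample has a higher normalized peak height than a randomly chosen non-DAT sample. The sketch below uses this rank-based definition; the function name and example values are illustrative assumptions, and the software actually used for the ROC analysis is not stated in this report.

```python
def auroc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5).

    scores: normalized peak heights; labels: 1 = DAT, 0 = non-DAT.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one sample in each group")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up peak heights for four sera (two DAT, two non-DAT).
example_auc = auroc([0.9, 0.4, 0.6, 0.1], [1, 1, 0, 0])
```

An AuROC of 0.5 corresponds to chance; values below 0.5 indicate that the peak is lower, rather than higher, in the DAT group.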

Artificial Neural Network Analysis. The artificial neural network (ANN) algorithm was trained
using Glu-Fib or TIC normalized peak data from the training dataset. Three different approaches
were used: (i) the training set was divided into a sub-training set and sub-qualification set for
cross-validation, (ii) the full training set was used, and (iii) the normalized peak data from the
full training set were ranked across samples and expressed as quantiles. These training sets were
used independently to train 101 feed-forward ANNs in Matlab. The median performing ANN(s) was
selected using AuROC, as well as a Combinatorial ANN (CANN101) that was the average of 101 ANNs.
The external qualification dataset was processed identically to the training set, with quantiles
being determined using the training set rankings. A priori threshold values used for qualification
were determined differently based on the approach. In the first approach, the internal
qualification dataset was used to determine threshold values for either single ANNs or CANN101
and the independent test set was run against ANNs trained on the sub-training dataset or the
complete training dataset. In the second and third approach, thresholds were determined by ROC
analysis of the training set.
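
Two steps in this procedure lend themselves to a short sketch: expressing a test-set peak as a quantile of the training-set ranking, and averaging the outputs of many networks into a single combinatorial score that is then compared with an a priori threshold. The code below is a hedged illustration only. The exact quantile definition, the function names, and the example numbers are assumptions; the networks themselves (trained in Matlab) are not reproduced.

```python
from bisect import bisect_right


def training_quantile(value, sorted_training_values):
    """Fraction of training-set values that are <= value (assumed definition)."""
    if not sorted_training_values:
        raise ValueError("empty training set")
    return bisect_right(sorted_training_values, value) / len(sorted_training_values)


def combinatorial_output(outputs):
    """Average the outputs of individual networks (the CANN idea)."""
    if not outputs:
        raise ValueError("no network outputs")
    return sum(outputs) / len(outputs)


def classify(score, threshold):
    """Return 1 (DAT) when the score reaches the a priori threshold, else 0."""
    return 1 if score >= threshold else 0


# Made-up training heights for one peak, and one test-set value.
train_peak = sorted([2.0, 5.0, 1.0, 4.0])
q = training_quantile(4.5, train_peak)

# Made-up outputs from three networks (101 were used in the study).
cann = combinatorial_output([0.2, 0.9, 0.7])
call = classify(cann, threshold=0.5)
```

Because the quantile is always taken against the training ranking, test-set samples never influence the scale on which they are judged.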


Findings. Training set serum samples from TMMC (Table 2) were collected between 2005 and
2010, with 11.5% (n=10), 6.9% (n=6), 16.1% (n=14), 17.2% (n=15), 35.6% (n=31) and 12.6%
(n=11) from 2005 to 2010 respectively. Although we defined 3 subgroups of the non-DAT
group, for this study they are treated as two groups: DAT and non-DAT. Since the majority of
DAT samples available were from females, we attempted to frequency match the two groups.
Furthermore, individuals from TMMC in the non-DAT group were selected to reflect etiologies
common to stranded sea lions admitted to TMMC: between 1991 and 2000, among 3,379 non-DAT
individuals, malnutrition, leptospirosis, trauma, and miscellaneous causes comprised 35%, 30%,
19%, and 11% of cases respectively, with carcinoma present in 3% of cases. In addition to the
107 sera used as a training set, we used an independent test set of 20 sera for qualification, and
the identities and diagnoses were blinded to the investigator until after analysis. These 20 sera,
collected from 2007 to 2010, were chosen to include 10 DAT and 10 non-DAT and in general reflect
the types of cases seen at TMMC.


Table 2. Sea lion frequency distribution: training set.

                     DAT           non-DAT       NAVY
Total                34 (31.8%)    53 (49.5%)    20 (18.7%)
Male                 6 (17.6%)     16 (30.2%)    20 (100%)
Female               28 (82.4%)    37 (69.8%)    0
Age
  pup                0             0             0
  yearling           1 (2.9%)      8 (15.1%)     5 (25%)
  juvenile           2 (5.9%)      6 (11.3%)     7 (30%)
  sub-adult          3 (8.8%)      15 (28.3%)    2 (10%)
  adult              28 (82.4%)    24 (45.3%)    6 (30%)
Euthanasia/Death     21 (61.8%)    33 (62.3%)    0

Hematological and serum biochemistry data, corresponding to the draw date of the sera used for
peptide profiling, were available for serum collected from 76 of the 87 individuals from TMMC
and all 20 of the sera collected at the USNMMP. Individuals in the DAT group had significantly
higher red blood cell counts, hemoglobin, and hematocrit than the non-DAT group (1.1- to
1.2-fold) despite lower levels of BUN and BUN:creatinine ratios (-5 and -2 fold, respectively).
The DAT group also had lower levels of white blood cells and banded neutrophils (-1.6 and
-2.5 fold), but increased levels of lymphocytes and eosinophils (1.3 and 2.4-fold). Levels of
Na, Cl, Mg, P and Na/K ratios were lower in the DAT group (-1.1, -1.1, -1.3, -1.6, and
-1.1-fold), while K was higher in the DAT group (1.1-fold). Lastly, albumin was higher
(1.3-fold) and triglycerides and sorbitol dehydrogenase were lower (-3.3 and -1.4 fold) in
the DAT group.

Figure 3. Principal components analysis of 104 MALDI peaks. Animals were designated as
DAT, leptospirosis, other (other ailment), or NAVY (marine mammal program) to allow fine
discriminatory analysis of the groupings.
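
The hematology and serum chemistry differences above are quoted as signed fold values (for example 1.3-fold higher or -5 fold lower). The report does not state how these were computed; the sketch below assumes the common convention in which a ratio below 1 is reported as the negative reciprocal, so that a five-fold decrease reads as -5. The function name and inputs are illustrative.

```python
def signed_fold_change(group_mean, reference_mean):
    """Signed fold change of a group relative to a reference (assumed convention).

    Ratios >= 1 are returned as-is; ratios < 1 are returned as -1/ratio.
    """
    if group_mean <= 0 or reference_mean <= 0:
        raise ValueError("means must be positive")
    ratio = group_mean / reference_mean
    return ratio if ratio >= 1 else -1.0 / ratio


# Made-up group means: one analyte higher in DAT, one lower.
higher = signed_fold_change(13.0, 10.0)
lower = signed_fold_change(10.0, 50.0)
```

Under this convention no value falls strictly between -1 and 1, which matches the way the values are quoted in the text.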


Principal components analysis was utilized to determine whether exploratory analysis could
discriminate between groups such as DAT, Leptospirosis, non-DAT without Leptospirosis, and
NAVY sea lions (Figure 3). Plotting two principal components based on 104 MALDI peaks
showed  considerable  spatial  overlap  between  DAT  and  non-DAT  groups.  Only  sea  lions  with 
leptospirosis  and  those  from  the  NAVY  Marine  Mammal  Program  were  distinctly  separate, 
suggesting  that  features  common  to  sea  lions  in  general  are  also  common  to  those  with  DAT.  For 
this  reason,  supervised  learning  algorithms  were  the  next  logical  step  to  discern  whether 
multidimensional  relationships  could  be  informative  in  the  discrimination  of  DAT  from  non- 
DAT. 

To determine if a single peak could discriminate between the DAT and non-DAT group,
receiver operator characteristic (ROC) curves were generated using normalized peak height. No
single peak had an area under the curve (AuROC) > 0.8; therefore none were excellent classifiers
of DAT (Figure 4). Peaks normalized using Glu-Fib had a mean AuROC of 0.543, ranging from
0.383 to 0.692. TIC normalized peaks had a mean AuROC of 0.538, ranging from 0.396 to
0.754. The top performer from TIC, peak 3017 m/z, had an AuROC ± S.E. of 0.754 ± 0.054
(Figure 4). Four different thresholds were determined a priori from the ROC curve and were
qualified with the independent test set. Using a minimum mis-classified threshold we achieved
100% specificity but only 20% sensitivity with 8 of 10 DAT individuals being called incorrectly
(Table 3). Further performance analysis of peak 3017 m/z using thresholds which increased
sensitivity resulted in a decrease of specificity.

Figure 4. Individual peak AuROC. Peak 3017 m/z had the highest AuROC = 0.75. No other
peak demonstrated a good ability to classify sea lions with DAT.


In addition to evaluating individual peak performance for predicting DAT, we were also
interested in whether there were peaks that predicted individuals in the NAVY group. When
performance was evaluated for picking NAVY samples versus non-NAVY, 21 TIC normalized
peaks had AuROC > 0.8, and the best performer, 1362 m/z, had an AuROC ± S.E. of 0.979 ±
0.02 (Figure 5). Interestingly this peak was mostly absent in sera collected at TMMC. Using an
optimum threshold, the individuals in the NAVY group were called correctly 18 of 20 times (90%
sensitivity), and the two mis-called (#2 and #9) were both initial blood draws and SL#2 showed
clinical signs (behavioral; poor performance). Moreover only four non-NAVY group individuals
(CSL 6896, 9111, 9271 and 9770) were called incorrectly (95% specificity). Using the same
threshold only one individual in the independent test set was called NAVY (data not shown; CSL
9766, an adult female with acute DAT which was released).


Figure 5. Peak 1362 m/z largely absent in NAVY samples. [Plot of peak 1362 m/z: Navy (n=20)
vs. TMMC (n=87).]


Since no single peak was an excellent classifier of DAT (AuROC > 0.8), peak data were
modeled using artificial neural networks. Glu-Fib and TIC normalized peaks were used
separately, and 101 artificial neural networks (ANNs) were trained using three different
approaches. The first approach involved splitting the training set into a training set and
qualification set for cross-validation, whereas the second approach utilized the complete training
set. The third approach quantiled peak height across variables, and these quantiled data were
used to train ANNs. Additionally, with all three approaches, two types of ANNs were chosen for
qualification: median performing ANNs (based on a median AuROC) or a Combinatorial ANN
(CANN101). In the first approach, the internal qualification set was used to set thresholds for
either single ANNs or CANN101 and the independent test set was run against these models or a
model(s) trained using the complete training set. In the second and third approach, thresholds
for models were determined by ROC analysis of the training set without cross-validation.

The generated models were qualified using a blinded independent test set of 20 sera, 10 DAT
and 10 non-DAT. Using thresholds determined a priori, the performance of each model was
evaluated by predicting a 1 or 0 for each of the 20 sera (Table 3). Compared to the single peak
3017 m/z which gave 100% specificity but only 20% sensitivity, using the different ANN
approaches we achieved high specificity (100%) and high sensitivity (100%). Specifically, we
found the best performance of Glu-Fib normalized data was 30% sensitivity and 100% specificity
which was achieved using a median ANN (Glu-Fib-ANN.^). Relative to ANNs generated with
Glu-Fib, models made using TIC normalized data achieved higher sensitivity (100% versus 40%)
as well as high specificity (90%). A negative predictive value of 100% was achieved using a
median ANN (TIC-ANNi) which was the highest seen in any model. This model predicted all 10
DAT individuals correctly with four false positives. The four individuals that were predicted
incorrectly cannot be explained by sex, age, primary etiology or blood chemistry. Other median
TIC ANNs had different performance measures despite the same AuROC, and overall using an
optimum cut-off (OC) when different from a minimum mis-classified (minMC) cut-off resulted in
higher sensitivity with minimum loss to specificity. For example, in the case of TIC-ANN67 and
QuantiledTIC-ANNgg, the OC improved performance while maintaining the same specificity as
the minMC.

Table 3. Qualification results of NRN or ANN models. TIC, total ion current normalized;
Glu-Fib, glufibrinopeptide internal standard normalized; minMC, minimal misclassification
threshold; optCO, optimal threshold; npvCO, negative predictive value optimized threshold;
CANN101, combination of 101 ANN models; Median, median ANN model. Outcome: 1 = DAT,
0 = Non-DAT.

                           Peak 3017 NRN                 ANN
Patients         Outcome   TIC       TIC       TIC       TIC       TIC       Glu-Fib
                           minMC     optCO     npvCO               Median    CANN101
CSL 7507         1         0         0         1         1         1         0
CSL 7778         0         0         0         0         0         0         0
CSL 7177         1         0         0         1         0         1         0
CSL 7813         0         0         0         1         0         0         0
CSL 9006         1         0         0         1         0         1         0
CSL 9023         0         0         0         1         0         1         0
CSL 8964         0         0         0         1         1         1         0
CSL 9278         1         0         0         1         1         1         0
CSL 8868         0         0         0         1         0         0         0
CSL 9058         1         0         0         0         0         1         0
CSL 9790         0         0         0         0         0         1         0
CSL 9771         1         1         1         1         0         1         1
CSL 9747         1         1         1         1         1         1         1
CSL 9353         0         0         0         1         1         1         0
CSL 8963         0         0         0         1         0         0         0
CSL 9810         0         0         1         1         0         0         0
CSL 9250         1         0         0         1         1         1         0
CSL 9280         0         0         0         0         0         0         0
CSL 9751         1         0         0         0         0         1         1
CSL 9766         1         0         0         1         0         1         0
Sensitivity                20%       20%       80%       40%       100%      30%
Specificity                100%      90%       30%       80%       60%       100%
Pos Pred Value             100%      67%       53%       67%       71%       100%
Neg Pred Value             56%       53%       60%       57%       100%      59%
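
The summary rows of Table 3 follow directly from the per-animal calls. As a worked check, the sketch below recomputes sensitivity, specificity, and the predictive values for two of the columns (peak 3017 with the minMC threshold, and the TIC median ANN), using the outcome and call values transcribed from the table in row order. The function and variable names are illustrative assumptions.

```python
def diagnostic_metrics(outcomes, calls):
    """Sensitivity, specificity, PPV and NPV from paired 0/1 outcomes and calls."""
    pairs = list(zip(outcomes, calls))
    tp = sum(1 for o, c in pairs if o == 1 and c == 1)
    fn = sum(1 for o, c in pairs if o == 1 and c == 0)
    fp = sum(1 for o, c in pairs if o == 0 and c == 1)
    tn = sum(1 for o, c in pairs if o == 0 and c == 0)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Values transcribed from Table 3, in row order (CSL 7507 ... CSL 9766).
OUTCOME    = [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1]
PEAK_MINMC = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
TIC_MEDIAN = [1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1]

peak_metrics = diagnostic_metrics(OUTCOME, PEAK_MINMC)
median_metrics = diagnostic_metrics(OUTCOME, TIC_MEDIAN)
```

For the TIC median ANN this gives 10 true positives and 4 false positives, which is the "all 10 DAT individuals correctly with four false positives" result described in the text.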

4.  Assess  the  performance  of  2D  gel  electrophoresis  of  plasma  proteins  to  predict  chronic 
domoic  acid  toxicosis  in  sea  lions.  Assess  the  value  of  Artificial  Neural  Network  analysis  to 
enhance  performance  of  diagnostic  tests.  Validate  biomarkers  using  investigator-blinded 
plasma  samples  from  The  Marine  Mammal  Center. 

Thirty plasma samples (known as the training set) were separated by 2DE using large
format gels following ProteoMiner depletion. The plasma samples were from 10 sea lions with
DAT, 10 sea lions with leptospirosis, and 10 sea lions without DAT or leptospirosis, but with
another ailment. This distribution was chosen to reflect the distribution of disease in the
stranded sea lions at the Marine Mammal Center. Second dimension gels were poured by hand,
whereas previously we had purchased them from Bio-Rad; because of quality issues from the
supplier, we decided to pour gels in-house and rerun several of the samples. Gels were
post-stained with Sypro Ruby, imaged, and analyzed by Progenesis SameSpots. Where gels did not
focus completely, the process was repeated for these samples until we could obtain a matched set
of high quality images for training downstream models. Statistical analysis was conducted and
receiver operating characteristic (ROC) curves were calculated for each spot to investigate
whether one protein spot alone could be utilized as a biomarker of DAT.

Artificial neural network models were developed from the sea lion training set data. Statistical
performance measures and area under receiver operating characteristic (ROC) curves were
calculated for each of the matched spots (618 spots total). A total of 50 spots were selected for
model inclusion and identification based on the ability of the spot to predict an outcome of DAT
(area under ROC curve > 0.75) or Q-value < 0.05.

To validate candidate markers and models, 20 plasma samples blinded to the investigators were
sent from the Marine Mammal Center. Each sample was treated in the exact same manner as the
training set, including ProteoMiner depletion. Technicians at the Marine Mammal Center were
directed to take plasma samples from stranded sea lions with a diagnosis of chronic domoic acid
toxicosis. Samples were restricted to protocol criteria that adhered to those set for the training
set.

Large format gels were post-stained with Sypro Ruby, imaged, and analyzed by Progenesis
SameSpots. Where gels did not focus completely, the process was repeated for these samples
until we could obtain a matched set of high quality images for spot alignment. Test set gels
were automatically matched against the training set master and manually validated. Spot
numbers were assigned according to the training set master image. Normalized spot volumes
were extracted from the test data and exported for marker and model qualification.


Validated protein markers of interest were identified by LC/MS/MS. Tryptic peptides were
separated by nano-LC and eluted into a 5600 triple TOF mass spectrometer using standard
protocols. The top ten masses were selected for MS/MS fragmentation. Data were converted and
searched in MASCOT using 10 ppm parent ion tolerance and 0.5 Da MS/MS tolerances.
Oxidation of methionine and carbamidomethylation of cysteine were chosen as modifications.
Data were searched against a refined proteomic database constructed in our lab using data
downloaded from Swissprot. Species in the database included Panda, Dog, Human, Mouse, Sea
Lion, Rat, Pig, and Cow in an effort to maximize discoverability, yet minimize false
discovery. The criteria for assigning an ID to a protein were met when the MASCOT score for a
protein exceeded 80 and at least two peptides matched a mammalian protein in the large
database. Identifications (34/50) were supported from the initial gel analysis and were used. One
protein spot originally identified as desmoplakin 3 (area under ROC curve = 0.08) was not
supported in the second protein identification run and thus was not considered a reasonable
identification. At this time, this protein spot is considered ‘unknown’. However, an additional
protein spot was identified with higher confidence, thus returning our total number of proteins
identified to 35/50. The remaining 15/50 protein spots are unidentified, most likely due to a lack
of species-specific protein information.

A volcano plot of the training set protein comparison is shown in Figure 6. Notably, most
proteins did not exhibit a greater than 2-fold change in abundance and only two protein spots
(both Apolipoprotein E) were statistically different when corrected for false discovery rate.
These proteins were reported in the 2011/12 report along with the statement that these proteins
require further qualification as potential markers of DAT to be determined in the
investigator-blinded test set. Protein spot volumes in the matched test validation set were
separately compared to determine whether this statistical difference held true or was a
non-reproducible event. Table 4 lists the identified proteins by direction in fold change and an
associated p-value for comparison. Numerous protein forms of Apolipoprotein E (APOE) were
significantly lower in abundance in the DAT group. The two spots that were statistically lower in
the training set were 2159 and 3486, both of which remained statistically lower in the test set,
providing some validation of the training set relationship. Perhaps confusing is the fact that so
many of the spots are identified as APOE and that APOE is found in both the higher abundance
group and lower abundance group. Because proteins are separated in 2 dimensions and the
migration of the protein is a function of charge and molecular weight, any modification of the
protein can affect the distribution of said protein in the gel. In the case of modification,
proteins migrate to multiple spots in the gel and for the statistical comparison, each spot is
treated as an independent observation. In the case of apolipoprotein E, we know that this protein
is O-glycosylated in sea lions [1] and because we know the protein sequence, the identity of this
protein is of high confidence. Other protein values listed in Table 4 were included because they
demonstrated predictive ability in the training set, i.e. area under the ROC curve was >0.75. Of
interest is the fact that 15/21 proteins with negative fold-change values were lipoproteins,
whereas the proteins listed in the higher fold-change group tended to be more evenly distributed
amongst common plasma proteins.

Figure 6. Volcano plot of fold change vs. p-value for protein spot abundance in the training
set. Only 2 protein spots (Apolipoprotein E) were significantly lower.

Higher or Equal in Abundance                                      Lower or Equal in Abundance
Protein Name                               Fold-change  P-value   Protein Name                  Fold-change  P-value
ApoE                                       3.0          0.17      ApoE                          -8.8         0.047
Complement C4-A                            2.3          0.49      ApoA-IV                       -5.1         0.18
Vitronectin                                2.2          0.13      ApoE                          -5.0         0.03
Fibrinogen gamma chain                     1.8          0.22      ApoE                          -4.5         0.008
Carboxypeptidase N subunit 2               1.7          0.23      ApoE                          -3.7         0.015
Vitamin D-binding protein                  1.6          0.71      ApoA-IV                       -3.6         0.002
Albumin                                    1.4          0.04      ApoA-IV                       -2.8         0.005
ApoA-IV                                    1.3          0.01      Clusterin                     -2.4         0.003
Antithrombin-III                           1.3          0.11      ApoE                          -2.3         0.02
Hemoglobin subunit gamma                   1.2          0.55      Actin, cytoplasmic 1          -2.2         0.67
Fibrinogen gamma chain                     1.2          0.00      Clusterin                     -2.2         0.78
EGF-containing fibulin-like ECM protein 1  1.0          0.90      ApoE                          -1.9         0.05
ApoE                                       1.0          0.63      Immunoglobulin J chain        -1.8         0.19
                                                                  ApoE                          -1.8         0.06
                                                                  ApoE                          -1.7         0.64
                                                                  Similar to Kappa Light Chain  -1.7         0.052
                                                                  heavy chain variable region   -1.6         0.11
                                                                  Similar to IgJ                -1.4         0.336
                                                                  Glutathione peroxidase        -1.4         2E-04
                                                                  ApoE                          -1.3         0.31
                                                                  ApoA-1                        -1.2         0.3

Table 4. Fold-change abundance of protein spots in the qualification test set that were
statistically different or included in the training set based on area under ROC curve. Proteins
higher in abundance in sea lions with DAT are listed in the left-hand column. Proteins lower in
abundance in sea lions with DAT are listed in the right-hand column. Apolipoproteins E and
AIV dominate the population of spots that are lower in abundance.


Artificial neural networks created from the training set data were tested using the test set data.
Networks are first created using all 50 spot data and then sensitivity-values (s-values) are
tabulated. Networks are cross-validated internally to reduce overtraining, but cross-validation
does not remove the possibility of overtraining. Because we realize that a 2D gel is unlikely to be
an assay for a biomarker panel, we reduce the features included in the network by extracting the
s-values (weighting factors) and iteratively re-computing the networks using only the spots with
the highest s-values. For example, 21 models are built using data from only the two spots with the
highest s-values, then 21 models are built using data from only the three spots with the highest
s-values, and so on. Areas under the curve are plotted and the point at which the models’
performance plateaus is the minimal number of spots needed for the maximal predictive ability.
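
The plateau rule can be made concrete with a small sketch. The report identifies the plateau from a plot and gives no numeric criterion, so the rule used here (the smallest spot count whose AUC is within a tolerance of the best AUC observed) is an assumption, as are the function name, the tolerance, and the example AUC values.

```python
def minimal_spot_count(auc_by_count, tolerance=0.02):
    """Smallest number of spots whose AUC is within `tolerance` of the best AUC.

    auc_by_count: dict mapping number of spots -> summary AUC of the models
    built from that many top-ranked spots (hypothetical input layout).
    """
    if not auc_by_count:
        raise ValueError("no models supplied")
    best = max(auc_by_count.values())
    for count in sorted(auc_by_count):
        if auc_by_count[count] >= best - tolerance:
            return count


# Made-up AUCs for models built from the top 2, 3, 4, 5 and 6 spots.
example_aucs = {2: 0.70, 3: 0.80, 4: 0.90, 5: 0.905, 6: 0.91}
chosen = minimal_spot_count(example_aucs)
```

A tighter tolerance selects a larger panel; the trade-off is between assay simplicity and the last few hundredths of AUC.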

The  final  model  we  selected  included  8  protein  spots.  From  these  8  protein  spot  data  in  the 
training  set  we  created  101  neural  networks  and  allowed  these  networks  to  assign  a  prediction. 
Each  prediction  was  then  recorded  as  a  0  =  nonDAT  or  a  1  =  DAT.  The  votes  were  tabulated  and 
the  highest  number  of  votes  determined  the  classification.  This  was  done  to  remove  any 
investigator  bias  in  model  selection.  Once  completed,  we  tested  our  model’s  ability  to  predict  the 
diagnosis  in  the  investigator-blinded  test  set.  Once  the  key  was  revealed,  we  noticed  that  the 
model  tended  to  severely  overestimate  the  DAT  classification  to  the  point  that  nearly  all  sea  lions 
were  considered  to  be  afflicted  with  DAT. 
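
The voting step described above can be sketched in a few lines. This is an illustration under stated assumptions: each network's prediction is taken to be already recorded as 0 (nonDAT) or 1 (DAT), the function name is hypothetical, and the example votes are made up.

```python
def majority_vote(votes):
    """Classify as 1 (DAT) if more networks vote 1 than 0, else 0 (nonDAT)."""
    if not votes:
        raise ValueError("no votes supplied")
    dat_votes = sum(votes)
    non_dat_votes = len(votes) - dat_votes
    return 1 if dat_votes > non_dat_votes else 0


# Made-up votes from 101 networks for a single sea lion.
example_votes = [1] * 60 + [0] * 41
call = majority_vote(example_votes)
```

Using an odd number of networks, such as 101, means a tie cannot occur.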

Prior to the neural network modeling we had also created a series of smaller models using the
two APOE spot data and the clinical variables eosinophil count and hematocrit. These data were
chosen because we had found and published a significant relationship between acute DAT,
eosinophil count, and hematocrit [2]. Individual APOE spot data were tested for marker
performance in the test set, as were eosinophil count and hematocrit. These variables were
further combined using neural networks to establish relationships that could be tested in the
test set. Individually, the APOE spots performed below expectation (spot 2159, Sensitivity
100%, specificity 38%; spot 3486, Sensitivity 86%, Specificity 100%, Sensitivity 62%).

Input Data                                  Sensitivity   Specificity   trAUC
Eosinophil count                            71%           92%           0.53
Hematocrit                                  71%           46%           0.79
2 APOE spots + Eosinophils + Hematocrit     86%           85%           1
1 APOE spot (2159) + Eosinophils            86%           85%           1

Table 5. Statistical performance measures for eosinophil count, hematocrit and combinations
of APOE spots with clinical values. Sensitivity and Specificity are based on thresholds
determined in the training set. Area under the ROC curve for the training set is given in the
column trAUC.


In Table 5 sensitivity and specificity were calculated for eosinophil count, hematocrit, APOE
spots 2159 + 3486 + eosinophil count + hematocrit, and APOE spot 2159 + eosinophil count.
Combinations of APOE spot data and clinical data were assimilated into 101 feed-forward neural
networks to allow voting as described above. Eosinophil count alone was a poor predictor of
chronic DAT in the training set, although it appears to be a reasonably good predictor for the
test set. Hematocrit was a good predictor of chronic DAT in the training set, but is a poor
predictor in the test set. The combination marker of APOE spots and clinical variables was a
perfect predictor in the training set and remained very good at classifying the test set. The
ability to classify correctly held true even when only a single APOE spot (2159) and eosinophils
were utilized in combination. In fact, there was no difference in performance measures between
2 APOE spots + eosinophils + hematocrit and a single APOE spot + eosinophils.

We have been attempting to determine whether total APOE levels can be used to discriminate
between sea lions with DAT and nonDAT. The directionality of abundance for APOE forms in
the 2DE experiments suggested that a majority of the spots are lower in abundance in sea lions
with chronic DAT compared to other stranded nonDAT sea lions. Although one APOE protein spot
was labeled as elevated with regard to fold-change it was not statistically significant. This
suggests that absolute levels of APOE coupled with eosinophil counts may suffice as a marker and
would be better positioned for a point of care test than charge and weight form measurements by
2DE. With this rationale in mind, we chose to investigate absolute levels by immunoblotting for
APOE. One important question that we wish to answer with the immunoblot is: will the
relationship of APOE with DAT exist when we normalize the levels to volume of plasma vs.
fraction of protein? This is often overlooked, but extremely important as clinical tests are
reported per volume of body fluid or as a ratio of a second measured analyte. They are not
reported as per mg protein, although this could be done.

Figure 7. Immunoblot for sea lion APOE in plasma samples from animals with DAT and no DAT.
High cross-reactivity with other plasma proteins has confounded the validation of APOE
measurements. An immunoreactive band at 37 kDa corresponds with the correct size of APOE for
sea lion. Although antibody specificity remains a question, it is interesting that this band at
37 kDa has the same quantitative relationship as APOE as determined by 2DE (Bar graph).

Immunoblots of APOE for sea lions have been published [1] on delipidated and semi-pure
apolipoprotein fractions. The PI contacted Dr. Steve Young to acquire an aliquot of the
antibodies utilized in the 1991 paper [1]. Unfortunately Dr. Young was unable to locate the
antiserum utilized 22 years prior. We then took advantage of the high similarity between the
C-terminal domains of human and sea lion APOE proteins and ordered a commercial antibody from
Abcam for immunoblot analysis. The cross-reactivity of the antibody was such that it was
primarily reactive against human APOE, but was listed to cross react with mouse and rat APOE
as well. Figure 7 illustrates a typical immunoblot for APOE in sea lion plasma. To test the
antibody’s ability to cross-react with a protein of similar molecular weight to APOE, lanes were
loaded with a volume of plasma equivalent to 0.3 µL, which equals approximately 21 µg
protein/lane. Two different samples were loaded: one sample was from a sea lion with DAT and
the other was from a sea lion without DAT. The APOE protein in sea lions is approximately
37 kDa in weight, but has been reported to migrate as two separate bands due to O-glycosylation.
The molecular weight is significantly higher than human APOE. Our results demonstrate that a
protein of 37 kDa is detected with the commercial APOE antibody; however, as is typical with
immunoblotting of sea lion plasma, there is a high cross-reactivity with other plasma proteins.
This problem has been described in previous reports from our laboratory and does not appear to
be due to the choice of secondary antibody, as we have exhaustively screened secondary antibodies
from commercial vendors and have utilized different dilutions and blocking reagents. Sea lion
plasma immunoblots are consistently dirty and data gained using this technique to validate
identifications and markers have been inconsistent.

We further compared the APOE 2DE results from spot 2159 with the immunoblot
measured by densitometry. Setting the non-DAT protein abundance data to 1, the protein
abundance data from the sea lion with DAT was set relative to the non-DAT sample. In both the
immunoblot and 2DE comparison, the abundances correlate very well and were almost identical.
This result suggested that the immunoblot method may be a workable method from which to
validate the APOE results for sea lions. The question that still remains is: is this 37 kDa band
really APOE, or is it an artifact of some non-specific protein that varies in a similar direction
as APOE and is of a similar molecular weight? Before addressing this question we decided to run a
larger set of plasma samples to determine whether the relationship is consistent. Fifty-two sea
lion plasma samples were run as described above (12% Bis-Tris PAGE). The gels were blotted and
membranes probed for “APOE”. The results demonstrated that the 37 kDa band was detectable in
about 50% of the samples. Because we loaded a standard reference on both sides of the gel, we
were able to determine immediately that the problem with detection was due to inefficient
transfer. We have spent the last two months attempting different transfer apparatuses, buffers
and running conditions to no avail. This, coupled with the fact that we are not certain that the
antibody is specific, has created a conundrum. At this point we are discussing whether to continue
the validation by immunoblot or completely redirect the validation toward parallel reaction
monitoring (quantitative tandem mass spectrometry) assays that we have developed for cerebral
spinal fluid proteins. We are leaning toward the latter because specificity of detection is very
high confidence with mass spectrometry, as parent and product ions for sea lion APOE are detected
in undepleted plasma.

5.  Ancillary  Studies:  compare  cerebral  spinal  fluid  proteins  by  tandem  mass  spectrometry 
between  acute  DAT,  chronic  DAT,  and  non-DAT  sea  lions.  Compare  serum  from  DAT  and 
non-DAT  sea  lions  using  iTRAQ. 


Tandem Mass Spectrometry and cerebral spinal fluid biomarkers.

In  August  2010,  the  grant  N000140810341  was  reviewed.  In  the  review,  the  referee  suggested 
that  the  project  should  consider  the  analysis  of  cerebral  spinal  fluid  (CSF)  in  the  effort  to 
discover  biomarkers  of  DAT.  Due  to  major  limitations  in  CSF  sample  availability  we  had  not 
previously  considered  this  fluid  for  analysis.  Based  on  the  reviewer’s  recommendation  we 
reassigned  effort  from  the  initial  aim  3  proposed  in  the  grant  proposal  to  initiate  a  proteomic 
investigation  of  cerebral  spinal  fluid  from  sea  lions.  Due  to  sample  limitations  at  The  Marine 
Mammal  Center,  we  were  limited  to  12  CSF  samples  for  the  initial  analysis.  A  detailed 
description of the samples and rationale was provided in the 2010/11 Annual report.


To summarize our label-free proteomic findings from cerebral spinal fluid analysis, five CSF
proteins were considered differentially abundant (Figure 8) based on Fisher’s Exact test:
immunoglobulin lambda 6-C (Igλ6-C), Gelsolin (GSN), Dickkopf-3 (DKK3), Neuronal Cell Adhesion
Molecule 1 (NCAM1), and Oligodendrocyte myelin glycoprotein (OMG). Individually, all of the
proteins appear to have some value as biomarkers to discriminate DAT from non-DAT. To show
that the proteins together can completely separate both populations, we used principal
components analysis to group the samples in an unsupervised manner. Based on these 5 proteins
alone, two groups of sea lions could be easily discerned (Figure 9).

Figure 8. Spectral count data for each individual sea lion for the 5 significant proteins
identified using tandem mass spectrometry. Dashed line represents the threshold value that
best discriminates between DAT and non-DAT.

Figure 9. Principal components analysis of 5 statistically different proteins from DAT and
non-DAT CSF samples.
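
As an illustration of how a Fisher's Exact test can flag a differentially abundant protein
from spectral counts, the following Python sketch compares one protein's counts against all
other counts in each group. The 2x2 layout, the pooling of counts by group, and the numbers
are assumptions made for illustration; the report does not state how the contingency tables
were built.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical pooled spectral counts (not data from this study):
# rows = DAT, non-DAT; columns = counts for the protein, all other counts.
p_value = fisher_exact_two_sided(2, 998, 15, 985)
print(f"p = {p_value:.4f}")
```

A small p-value here only says the protein's share of spectra differs between the pooled
groups; it does not account for animal-to-animal variability.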


We attempted to validate four of the five proteins in Figure 8 using Western blot analysis,
but commercial antibodies that might have cross-reacted with sea lion proteins did not provide
sufficient evidence of appropriate cross-reactivity based on a positive control sample of rat
brain. Although bands were visualized by Western blot, marked differences from predicted
molecular weight gave us pause in believing the data, at least until we can verify the
complete sea lion protein sequences.

A sixth protein, Reelin, was identified by tandem mass spectrometry and was numerically lower
in sea lions with DAT, but did not show a statistical change in abundance (P<0.12). Reelin is
an interesting protein that is depressed in schizophrenia, Alzheimer’s disease, and epilepsy.
Reelin serves as a ligand for ApoE receptor 2 and very low density lipoprotein receptors and
has been shown to inhibit granular cell dispersion in the hippocampus of mice dosed with
kainate (analogous to domoic acid). Because of the relevance of this protein to neurological
disorders that parallel domoic acid toxicosis, we decided to validate the directionality of
Reelin abundance and determine whether modest differences exist that are not detectable using
label-free tandem mass spectrometry quantification, which is inherently variable. We utilized
a western blotting approach using an anti-mouse Reelin antibody. The anti-Reelin antibody gave
better than expected results and showed cross-reactivity with a low molecular weight Reelin
protein of expected size (180 kDa). Non-specific cross-reactivity was noticeably absent, in
contrast to sea lion plasma.

We then loaded a 4-12% polyacrylamide gel with the same samples used for mass spectrometry
and ran the gel under denaturing conditions. Lanes were loaded according to CSF volume (10 µL
CSF) rather than total protein to reproduce a clinical unit of measure similar to what is used
in a diagnostic assay, e.g. mg/dL vs. mg/g protein. In this case, because sea lions with DAT
had higher total CSF protein concentration on average, the amount of protein loaded onto DAT
lanes exceeded that of control sea lions. Based on western blot results, Reelin protein in CSF
of sea lions with DAT was about 1.4-fold lower (P<0.001) compared to control animals (Figure
10). These data suggest that domoic acid toxicosis in sea lions shares a common mechanism with
other mammalian neurodegenerative diseases, in part through Reelin signaling. Secondly, the
western blot data support the mass spectrometry identification and quantitative
directionality, i.e. low spectral counts are equivalent to less protein. Thirdly, control
animals were non-DAT, but one animal did have encephalopathy, suggesting that differences in
Reelin abundance may indeed be much larger if healthy animals or animals without brain injury
were compared.

The utility of Reelin alone as a biomarker for DAT is not exceptional [Sensitivity 62%,
Specificity 50%, Positive Predictive Value 100%, Negative Predictive Value 50%], and Reelin
may also reflect neuronal injury other than DAT. Together with the 5-protein panel of CSF
candidate markers, it may lend more confidence to a classification. Additionally, the changes
in Reelin offer important insight into the mechanism of DAT in sea lions. Establishing common
patterns of protein abundance between sea lions and other mammalian laboratory animal models
provides the opportunity to draw parallels between tightly controlled laboratory experiments
and wild sea lion populations that, under the Marine Mammal Protection Act, cannot be utilized
for comparative experimentation.
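
For readers less familiar with the four test statistics quoted for a biomarker, the following
Python sketch shows how each is derived from a confusion matrix. The counts are hypothetical
and are not taken from this study; the function name is ours.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts.

    tp/fn: diseased animals called positive/negative by the test.
    tn/fp: non-diseased animals called negative/positive by the test.
    """
    def ratio(num, den):
        # Undefined when a row or column of the matrix is empty.
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # diseased animals detected
        "specificity": ratio(tn, tn + fp),  # non-diseased animals excluded
        "ppv": ratio(tp, tp + fp),          # positive calls that are correct
        "npv": ratio(tn, tn + fn),          # negative calls that are correct
    }


# Hypothetical counts for illustration only.
metrics = diagnostic_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity describe the test itself, whereas the predictive values also
depend on how common the disease is in the animals being tested.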

Figure 10. Reelin abundance measured by tandem mass spectrometry and western blotting.
Although not statistically different by tandem mass spectrometry analysis, Reelin was lower
measured by Western blotting (P<0.05, T-test).

Based on the interesting results from the CSF protein marker study, we wanted to determine
whether a panel of protein markers could be constructed for this fluid that would allow us to
create a diagnostic platform for the measurement of these proteins in sea lion CSF. As
mentioned previously, western blots only worked for the protein Reelin, so antibody-based
assays are not a likely option for determining sea lion CSF protein abundance for the
candidate markers listed. Additionally, it is important that measurements be made and reported
in SI units because standardized diagnostics are rarely conducted using relative values
(exceptions being cDNA array data for cancer diagnostics and immunohistochemistry), and matrix
effects can skew mass spectrometry data if internal standards are not available for estimating
these effects. To that end, we constructed mass spectrometry assays similar to selected
reaction monitoring (also referred to as selected ion monitoring) assays using our ABSciex
5600 Triple TOF instrument for all six proteins. Quantification by ion monitoring involves a
stable isotope internal standard that is chemically identical to the tryptic peptide selected
for monitoring, except that one of the amino acids in the synthetic peptide contains 13C/15N,
making this peptide slightly heavier in mass. For all practical purposes, the peptide will
elute and fragment the same as the native peptide, but because it is slightly heavier, it will
be detected at +8 or +10 amu if lysine or arginine, respectively, is labeled. Therefore, the
mass spectrometer is able to detect both peptides at the same time and fragment the parent
ions into b and y ions for identification. Measurements are made on the fragment ions, which
offers an additional level of specificity to the assay in case other parent ions overlap in
mass at a specific elution time. Secondly, with this type of standardization we are able to
validate our peptide identifications based on comparisons of elution time and fragment ion
spectra.
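
The +8 and +10 amu offsets follow from the isotope composition of the labeled residue. The
Python sketch below works this out, assuming the labeled lysine or arginine is uniformly
13C/15N substituted (6 carbons and 2 nitrogens for lysine, 6 carbons and 4 nitrogens for
arginine); the report does not specify the labeling scheme, and the 500.0 m/z precursor is a
made-up value.

```python
C13_DELTA = 1.0033548  # mass difference between 13C and 12C, in Da
N15_DELTA = 0.9970349  # mass difference between 15N and 14N, in Da

# (carbon atoms, nitrogen atoms) replaced in a uniformly labeled residue.
LABELED_ATOMS = {"K": (6, 2), "R": (6, 4)}


def heavy_shift_da(residue):
    """Mass added to a peptide by one uniformly 13C/15N-labeled K or R."""
    carbons, nitrogens = LABELED_ATOMS[residue]
    return carbons * C13_DELTA + nitrogens * N15_DELTA


def heavy_precursor_mz(light_mz, charge, residue):
    """m/z of the labeled standard given the native peptide's m/z and charge."""
    return light_mz + heavy_shift_da(residue) / charge


for residue in ("K", "R"):
    print(residue, round(heavy_shift_da(residue), 4))

# Hypothetical doubly charged native precursor at 500.0 m/z, lysine-labeled.
print(round(heavy_precursor_mz(500.0, 2, "K"), 4))
```

Because the shift is divided by the charge state, a doubly charged standard appears about 4
or 5 m/z above the native peptide rather than 8 or 10.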

An example data profile is shown in Figure 11. Five microliters of sea lion CSF was digested
with trypsin, and 6 internal peptide standards, synthesized and quality controlled by New
England Peptide, were spiked into the CSF sample at known concentration prior to peptide
isolation by solid phase extraction. Peptides plus standards were injected onto a nano C18
column and data acquired across a 30 min elution gradient. Product ion scans are made for
specific parent ion masses relevant to the sea lion peptides we are monitoring, and fragment
spectra are recorded for each ion with a mass within 1 amu of the entered mass. Fragment ion
masses specific for the peptides of interest and their associated standards are extracted and
intensity is plotted across time (Figure 11B). The areas under the extracted ion chromatograms
for the standard and native peptide are compared, and the ratio between the two is used to
calculate endogenous concentration. In Figure 11B we show the quantification of sea lion
NCAM1, and the linear range of this peptide is shown in Figure 11C. As proof of concept we
measured all six candidate markers in CSF from a single sea lion with DAT to determine whether
the standards could be used to measure tryptic peptides in sea lion. All standard peptides and
endogenous tryptic peptides displayed identical elution profiles and fragment ion spectra
(shifted +8 or +10 for the standard), confirming the identity of the peptides. The measured
concentration of each candidate protein is compiled in Table 6. The multiplexed protein assay
provided reliable numbers that are specific for sea lion proteins, can be compared across many
sea lion CSF samples, and importantly can be directly linked to a standard peptide.

Figure 11. Reaction monitoring assay for sea lion NCAM1. A) Total ion current of tryptic CSF
peptides. NCAM1 peptide elutes at 10.44 min (red box). B) Extracted ion chromatograms for
NCAM1 sea lion peptide. The internal standard containing 13C/15N elutes at the identical time
point as sea lion native NCAM1. C) Dilution curve showing linearity of NCAM1 from 1 fmol to
200 fmol.
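
The area-ratio calculation described above can be sketched in Python as follows. This is a
minimal single-point example that assumes the labeled standard and the native peptide respond
identically in the instrument; the peak areas, spike amount, and sample volume are
hypothetical and the function names are ours.

```python
def endogenous_fmol(area_native, area_standard, spiked_fmol):
    """Estimate native peptide amount from the native/standard peak-area ratio."""
    if area_standard <= 0:
        raise ValueError("standard peak area must be positive")
    return area_native / area_standard * spiked_fmol


def fmol_per_ul(amount_fmol, sample_volume_ul):
    """Convert an amount to a concentration in the analyzed sample volume."""
    if sample_volume_ul <= 0:
        raise ValueError("sample volume must be positive")
    return amount_fmol / sample_volume_ul


# Hypothetical extracted-ion-chromatogram areas and spike level.
amount = endogenous_fmol(area_native=2.0e5, area_standard=1.0e6, spiked_fmol=50.0)
concentration = fmol_per_ul(amount, sample_volume_ul=5.0)
print(f"{amount:.1f} fmol, {concentration:.1f} fmol/uL")
```

In practice the result is only trusted when the measured ratio falls inside the linear range
established by a dilution curve such as the one in Figure 11C.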


Table 6. Six CSF proteins measured by mass spectrometry reaction monitoring using the
assay developed in our laboratory. Protein abundance can now be reported in SI units
instead of relative spectral counts, making this a standardized measurement specific to sea
lions.

CSF Protein Name                             fmol    fmol/µL CSF
Gelsolin                                     4.91    8.2
Oligodendrocyte Myelin Glycoprotein (OMG)    24.7    41.2
Reelin                                       13.6    22.6
IgG lambda 6-C                               8.1     13.6
NCAM (CD56)                                  2.2     3.7
Dickkopf 3 (DKK3)                            1.3     2.1

At this point, it is important not only to understand whether these candidate markers perform
as biomarkers using standardized measurements, but also to ask whether this information can
guide therapeutic trials for studies of intervention. One of the major hurdles to therapeutic
intervention is knowing which mechanisms parallel published studies in rodent models, thereby
creating precedent for informed clinical studies to proceed. Several studies have suggested
that domoic acid causes temporal lobe epilepsy in humans, rats, and sea lions. Temporal lobe
epilepsy is partially characterized by a widening of the dentate gyrus granular cell layer
known as granular cell dispersion (GCD). However, in sea lions there is an acute necrosis of
granular cells that is not characteristic of excitotoxic induced injury or epilepsy.
Scientists have speculated that a “sensitivity” to limbic seizures may result from an
adaptation to hypoxia; however, it is still not known why granular cells in sea lions become
“necrotic” following excitotoxic injury.

The protein phenotype in the CSF of sea lions with domoic acid toxicosis does not fit
neatly into expression profiles of temporal lobe epilepsy and may be specific to excitotoxic
injury in sea lions or denote some novel mechanistic feature of domoic acid toxicosis that has
not yet been revealed.

iTRAQ  Analysis  of  Plasma  Proteins 

Although not specifically listed in the grant proposal, we decided to apply advanced proteomic
techniques to the study of plasma proteins from sea lions with DAT. Two-dimensional liquid
chromatography tandem mass spectrometry (2D LC/MS/MS) is commonly utilized in biomarker
discovery programs for human clinical studies as a targeted approach.

For this study, we acquired 8 plasma samples from age/sex-matched sea lions (adult females)
stranded in 2009. Plasma samples were drawn within a few days of stranding in the DAT
animals. For non-DAT plasma samples, animals were allowed to recover from injury and plasma
was collected just prior to release. We did this because we were more interested in
understanding which plasma proteins were changing in animals with acute DAT versus a
relatively healthy sea lion without DAT. Plasma samples were thawed at MUSC and normalized
using Proteominer beads to decrease albumin and immunoglobulin abundance. Proteins were
digested using trypsin and peptides labeled using isobaric tags (iTRAQ reagent, Applied
Biosystems). Peptides were combined and separated by strong cation exchange spin column prior
to separation on a C18 column. Fractions were spotted to a MALDI target and data acquired
using an ABI 4800 TOF-TOF mass spectrometer. Proteins were quantified based on size tags and
identification assigned using MASCOT against the canine protein database. Protein
quantification was collated using iQUANTITATOR, an in-house program designed by Dr. John
Schwacke at MUSC.

Before discussing protein results, it is worth noting that of 4645 spectra acquired by mass
spectrometry, 4260 were not assigned to proteins. This largely points to the need for a sea
lion genome database for any downstream LC/MS/MS-intensive studies. Small differences in
amino acid sequence will not always permit peptide matching and protein identification.
Because dogs are most closely related to sea lions, we speculated that the canine database
would provide the best possible match against the sea lion peptide spectra.
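
To put the unassigned-spectra figure in proportion, the short Python calculation below
derives the fraction of spectra that were matched, using only the two counts reported above.
The function name is ours.

```python
def assignment_rate(total_spectra, unassigned):
    """Fraction of acquired spectra that were matched to a protein."""
    if total_spectra <= 0 or not 0 <= unassigned <= total_spectra:
        raise ValueError("counts must satisfy 0 <= unassigned <= total, total > 0")
    return (total_spectra - unassigned) / total_spectra


total, unassigned = 4645, 4260
rate = assignment_rate(total, unassigned)
print(f"{total - unassigned} assigned spectra ({rate:.1%})")
# prints: 385 assigned spectra (8.3%)
```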

From the iTRAQ experiment, 86 unique proteins identified by at least a single peptide were
quantified. Only ten proteins were identified by two or more peptides. Analysis of expression
change magnitude values (normalized fold change) indicated that only one protein was
significantly different between sea lions with DAT and sea lions without DAT. Serum amyloid A
was statistically higher in DAT sea lions; however, it did not perform well as a biomarker, as
only two sea lions with DAT had levels that were distinguishable from those of sea lions
without DAT.


6.  Conclusions 


Using a suite of targeted and non-targeted proteomic approaches coupled with machine
learning tools, we found no perfect biomarker for diagnosing domoic acid toxicosis in sea
lions; however, there are markers that perform well in combination with other data.

Objective-specific conclusions:

1) Proteominer Ligand Library depletion strategies for high abundance protein depletion
performed best for removing albumin. Protein diversity was elevated following depletion, and
these columns are a reasonable choice for future studies requiring protein depletion, although
Proteome Lab IgY depletion also appears to perform well and could be a suitable alternative
choice.

2)  Commercial  cytokine  arrays  developed  for  human  measurements  can  be  utilized  as  a 
screening  tool  offering  >92%  accuracy  in  the  diagnosis  of  acute  DAT,  but  will  only  detect  30% 
of  those  sea  lions  with  DAT.  Alternatively,  these  panels  can  also  exclude  a  diagnosis  of  DAT 
(predict  non-DAT)  with  >98%  accuracy,  but  once  again  will  only  detect  25%  of  the  truly  non- 
DAT  sea  lions. 

3) MALDI-ToF peptide profiling of acute DAT demonstrated that no single feature was an
excellent classifier of acute DAT. The best performing feature was located at a mass of 3017
m/z. External validation of this feature as a marker produced results similar to the
commercial cytokine kit for the diagnosis of DAT. Neural network modeling of the MALDI
features produced excellent models for DAT diagnosis in the training set, but the validation
performance was limited in that only 30% of DAT sea lions would be detected with 100%
accuracy. Once again, this would offer a high level of confidence to a suspected diagnosis of
DAT if the test was positive, but a negative result would offer no confidence to exclude DAT.
On the other hand, one neural network model performed well for exclusion of DAT with 100%
accuracy for 60% of the non-DAT sea lions. This model could offer a high level of confidence
when screening sea lions.

4) 2D gel electrophoresis studies of chronic DAT sea lions demonstrated that Apolipoprotein E
was the major classifier for DAT vs. non-DAT sea lions. Low levels of this protein correlated
with DAT, but alone were only a good biomarker. When Apolipoprotein E levels and eosinophil
counts were combined in a neural network model or by decision tree, the area under the
receiver operating characteristic curve (AuROC) was equal to 1 in the training set. External
test set validation supported this combination as a biomarker with a test sensitivity of 86%
and specificity of 85%. This combination performed the best of all markers in this study.

5)  Proteomic  analysis  of  cerebral  spinal  fluid  resulted  in  the  identification  of  6  proteins  that 
could  classify  acute  or  chronic  DAT  and  non-DAT  perfectly  in  a  very  small  number  of  animals. 
These  data  were  not  validated  due  to  lack  of  samples  from  sea  lions.  Notably,  the  protein  Reelin 
was  depressed  in  sea  lions  with  DAT  which  fits  a  phenotype  found  in  neurological  disorders  and 
could  be  a  therapeutic  target  in  future  studies. 


iTRAQ  analysis  of  serum  proteins  using  tandem  mass  spectrometry  was  limited  in  its  ability  to 
discriminate  sea  lions  based  on  DAT  vs.  non-DAT.  Serum  amyloid  A  was  statistically  elevated 
but  validation  of  this  protein  as  a  marker  was  unsuccessful. 
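
The AuROC cited in conclusion 4 can be read as the probability that a randomly chosen DAT
animal receives a higher model score than a randomly chosen non-DAT animal. The Python sketch
below computes it by direct pairwise comparison; the scores are hypothetical and are not
outputs of the models in this study.

```python
def auroc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    if not positive_scores or not negative_scores:
        raise ValueError("both groups need at least one score")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores, higher meaning "more likely DAT".
separated = auroc([0.9, 0.8, 0.7], [0.3, 0.2, 0.1])     # groups do not overlap
overlapping = auroc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2])   # one DAT score is low
print(separated, round(overlapping, 3))
```

An AuROC of 1 means the two groups do not overlap at all in the training data, which is why
the external test set sensitivity and specificity are the more informative figures.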

7.  Significance 

Domoic  acid  toxicosis  is  a  major  cause  of  death  in  sea  lions  along  the  Pacific  coast.  The  toxin 
responsible  for  this  toxicosis  is  excreted  by  the  body  very  quickly  making  this  a  very  difficult 
disease  to  diagnose.  The  Navy  marine  mammal  program  maintains  sea  lions  in  localities  where 
domoic  acid  is  prevalent;  therefore,  monitoring  of  toxicosis  related  to  domoic  acid  is  important. 

Data presented indicate that protein biomarkers for domoic acid toxicosis in minimally
invasive samples (plasma or serum) do exist, but that most of these markers have either low
sensitivity or low specificity, which limits the application of these tests. Apolipoprotein E
together with eosinophil count appears to be the best performing marker, with very good
sensitivity and specificity. Analyzed alone, neither eosinophils nor apolipoprotein E performs
as well as the two in combination. For a disease like domoic acid toxicosis, where dose and
timing are highly variable, it appears that combining routine clinical data with target
protein abundance is the better method by which to classify sea lions with a highly variable
disease course. Large scale assay validation for apolipoprotein E could provide an effective
method for screening sea lions along with a common clinical laboratory value.

Although not the main intention of the study, it is interesting that both plasma
Apolipoprotein E and cerebral spinal fluid Reelin levels were found to be lower in sea lions
with domoic acid toxicosis. From a mechanistic point of view, both proteins share a common
receptor in ApoE receptor 2, which is known to inhibit granular cell dispersion in the
hippocampus of mice treated with a drug similar to domoic acid. Treatments that elevate both
proteins could be relevant targets for therapeutic intervention.